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Statistical  Analysis  of  the  Nonhomogeneity  Detector 
for  Non-Gaussian  Interference  Backgrounds 

Muralidhar  Rangaswamy,  Senior  Member,  IEEE 


Abstract — We  derive  the  nonhomogeneity  detector  (NHD)  for 
non-Gaussian  interference  scenarios  and  present  a  statistical 
analysis  of  the  method.  The  non-Gaussian  interference  scenario 
is  assumed  to  be  modeled  by  a  spherically  invariant  random 
process  (SIRP).  We  present  a  method  for  selecting  representative 
(homogeneous)  training  data  based  on  our  statistical  analysis  of 
the  NHD  for  finite  sample  support  used  in  covariance  estimation. 
In  particular,  an  exact  theoretical  expression  for  the  NHD  test 
statistic  probability  density  function  (PDF)  is  derived.  Perfor¬ 
mance  analysis  of  the  NHD  is  presented  using  both  simulated 
data  and  measured  data  from  the  multichannel  airborne  radar 
measurement  (MCARM)  program.  A  performance  comparison 
with  existing  NHD  approaches  is  also  included. 

Index  Terms — EM  algorithm,  GIP,  goodness-of-fit  test,  maxi¬ 
mally  invariant  statistic,  ML  estimate,  NAMF,  NHD,  non-Gaussian 
interference,  SIRP,  type-I  error. 


I.  Introduction 

AN  important  issue  in  space-time  adaptive  processing 
(STAP)  for  radar  target  detection  is  the  formation  and 
inversion  of  the  covariance  matrix  underlying  the  disturbance. 
In  practice,  the  unknown  interference  covariance  matrix  is 
estimated  from  a  set  of  independent  identically  distributed  (iid) 
target-free  training  data,  which  is  assumed  to  be  representative 
of  the  interference  statistics  in  a  cell  under  test.  Frequently,  the 
training  data  is  subject  to  contamination  by  discrete  scatterers 
or  interfering  targets.  In  either  event,  the  training  data  becomes 
nonhomogeneous.  As  a  result,  it  is  not  representative  of  the 
interference  in  the  test  cell.  Hence,  standard  estimates  of  the 
covariance  matrix  from  nonhomogeneous  training  data  result 
in  severely  under-nulled  clutter.  Consequently,  constant  false 
alarm  rate  (CFAR)  and  detection  performance  suffer,  Signifi¬ 
cant  performance  improvement  can  be  achieved  by  employing 
pre-processing  to  select  representative  training  data. 

The  problem  of  target  detection  using  improved  training 
strategies  has  been  considered  in  [1] — [5].  The  impact  of 
training  data  nonhomogeneity  on  STAP  performance  is  consid¬ 
ered  in  [5]— [8].  The  works  of  [1]— [4],  [8],  [9]  have  addressed 
the  use  of  the  nonhomogeneity  detector  (NHD)  based  on  the 
generalized  inner  product  (GIP)  measure  for  STAP  problems 
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involving  Gaussian  interference  scenarios.  This  work  was 
extended  significantly  in  [10]  and  [11]  to  include  the  effects 
of  finite  sample  support  used  for  covariance  matrix  estimation. 
However,  the  corresponding  problem  for  non-Gaussian  inter¬ 
ference  scenarios  has  received  limited  attention.  This  is  due  to 
the  fact  that  tractable  models  for  correlated  non-Gaussian  inter¬ 
ference  have  become  available  only  in  recent  work  [12]— [14]. 

In  general,  nonhomogeneity  of  training  data  is  caused  by 
environmental  factors,  such  as  the  presence  of  strong  discrete 
scatterers,  dense  target  environments,  nonstationary  reflectivity 
properties  of  the  scanned  area,  and  radar  system  configurations 
such  as  conformal  arrays,  and  bistatic  geometries.  A  variety  of 
robust  adaptive  signal  processing  methods  to  combat  specific 
types  of  nonhomogeneities  have  been  developed  in  [15]— [19]. 

In  this  paper,  we  concern  ourselves  with  the  problem  of 
training  data  nonhomogeneity  caused  by  dense  target  envi¬ 
ronments  and  present  the  NHD  for  non-Gaussian  interference 
scenarios.  More  specifically,  two  p-tuple  random  vectors  xt 
and  xs  having  covariance  matrices  Rt  and  R„,  respectively, 
are  defined  to  be  nonhomogeneous  if  R73R(  vl,  where  I 
denotes  the  p  x  p  identity  matrix,  and  v  is  an  arbitrary  positive 
scale  factor.  In  other  words,  the  random  vectors  are  defined  to 
be  nonhomogeneous  if  they  do  not  share  the  same  covariance 
structure.  This  issue  can  be  readily  treated  by  examining  the 
eigenvalues  of  R^Rt  when  R(  and  Rs  are  known.  How¬ 
ever,  R;  and  Rs  are  seldom  known  in  practice,  rendering  the 
eigenvalue  analysis  infeasible.  Therefore,  we  concern  ourselves 
with  the  problem  of  obtaining  the  NHD  test  for  non-Gaussian 
interference  scenarios,  where  the  covariance  matrix  is  estimated 
from  finite  training  data  support. 

Specifically,  we  derive  the  NHD  for  non-Gaussian  interfer¬ 
ence  scenarios,  which  can  be  modeled  by  spherically  invariant 
random  processes  (SIRP)  and  present  a  statistical  analysis  of  the 
resultant  NHD  test.  Section  II  presents  the  relevant  mathemat¬ 
ical  preliminaries.  In  Section  III,  we  discuss  the  issues  of  co- 
variance  matrix  estimation  using  finite  data  as  well  as  the  use  of 
a  maximally  invariant  test  statistic  for  the  NHD.  Furthermore, 
we  present  a  statistical  analysis  of  the  NHD  and  show  that  a 
formal  goodness-of-fit  test  can  be  constructed  for  selecting  ho¬ 
mogeneous  training  data.  The  basis  of  our  NHD  strategy  lies 
in  characterizing  the  statistics  pertaining  to  homogeneous  SIRP 
clutter  scenarios  and  rejecting  realizations  departing  from  these 
statistics.  Performance  analysis  is  discussed  in  Section  IV.  A 
performance  comparison  with  existing  NHD  tests  is  also  in¬ 
cluded  therein.  Conclusions  and  future  research  directions  are 
outlined  in  Section  V. 

In  general,  the  problem  of  nonhomogeneity  detection  for 
SIRPs  is  complicated  by  the  fact  that  the  underlying  SIRP  co- 
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variance  matrix  and  characteristic  probability  density  function 
(PDF)  are  unknown.  Knowledge  of  the  SIRP  characteristic 
PDF  is  assumed  in  this  paper  as  a  first  step  toward  addressing 
the  problem.  This  information  can  be  gained  from  estimates  of 
the  first  order  PDF  obtained  from  experimental  data  using  his¬ 
togram  or  moment  techniques  [20],  A  significant  performance 
penalty  is  incurred  if  this  information  is  unavailable.  This  fact 
is  illustrated  through  an  example  in  Section  IV. 

The  main  contributions  of  this  paper  are  summarized  below. 

1)  Reduce  the  NHD  problem  for  SIRP  interference  sce¬ 
narios  to  one  of  testing  whether  two  data  sets  share  a 
common  covariance  structure  but  have  different  levels 
by  proper  use  of  the  maximum  likelihood  estimate  of 
the  covariance  matrix. 

2)  Provide  a  formal  goodness-of-fit  test  using  a  scale  in¬ 
variant  test  statistic. 

3)  Introduce  analytical  expressions  for  the  NHD  PDF, 
which  enable  calculation  of  the  threshold  setting  for 
the  NHD  test. 

4)  Analyze  the  performance  of  the  NHD  test  using  simu¬ 
lated  and  measured  radar  data. 

5)  Compare  the  performance  with  existing  NHD  tests, 
which  demonstrates  superior  performance  of  the  NHD 
test  of  this  paper  in  both  SIRP  as  well  as  Gaussian  sce¬ 
narios. 

II.  Preliminaries 

Let  x  =  [xi  X2  ■  ■  ■  xm)t  denote  a  complex  spherically  in¬ 
variant  random  vector  (SIRV)  having  zero  mean,  positive  def¬ 
inite  Hermitian  covariance  matrix  R  and  characteristic  PDF 
fv{v).  The  PDF  of  x  is  given  by  [21] 

f(x)  =  ~1h2M{q)  (1) 

where  |.|  denotes  determinant,  and 
q=xHR~1x 

h2 m{w)  =  J  u_2Mexp  fv(v)dv.  (2) 

Every  SIRV  admits  a  representation  of  the  form  [22]  x  =  zV, 
where  z  has  a  complex-Gaussian  PDF  CN( 0,  R),  and  V  is  a  sta¬ 
tistically  independent  random  variable  with  PDF  /y  (u).  Conse¬ 
quently,  the  covariance  matrix  of  x  is  given  by  Rx  =  RiJ( V2). 
In  practice,  R  and  fv(v)  are  unknown.  For  the  purpose  of  this 
paper,  we  assume  knowledge  of  /y  (u)  and  treat  the  problem  of 
nonhomogeneity  detection  with  respect  to  unknown  R.  Validity 
of  the  SIRP  model  for  clutter  encountered  in  STAP  applications 
has  been  extensively  discussed  in  [23]. 

Previous  work  [1]— [4],  [8]— [1 1],  [24]  employed  the 
GIP-based  NHD  for  Gaussian  interference  scenarios.  The 
GIP-based  method  relies  on  the  statistics  of  a  quadratic  form 
given  by  Q  =  xHR~1x.  This  method  can  be  used  as  an 
NHD  test  statistic  in  SIRV  intereference  if  a  perfect  estimate 
of  the  covariance  matrix  can  be  obtained,  which  calls  for  an 
extremely  large  sample  support  size  (infinite  sample  support). 
However,  in  practice,  the  training  data  available  in  a  given 


application  is  limited  by  system  considerations  such  as  the 
bandwidth  and  fast  scanning  arrays  and  more  fundamentally 
the  underlying  spatio-temporal  nonstationarity  of  the  scenario. 
Thus,  one  is  almost  always  forced  to  work  with  finite  sample 
support.  Consequently,  the  covariance  matrix  estimate  for  this 
problem  can  only  be  obtained  to  within  a  constant  of  the  sample 
covariance  matrix,  which  is  the  maximum  likelihood  estimate 
of  the  covariance  matrix  underlying  the  Gaussian  component 
of  the  SIRV.  Typically,  this  constant  is  unknown  in  practice. 
Hence,  the  goodness-of-fit  tests  proposed  in  [3],  [4],  [9],  and 
[10]  cannot  be  properly  implemented  for  this  problem.  On 
the  other  hand,  implementation  of  the  NHD  tests  proposed  in 
[3],  [4],  [9]— [1 1],  and  [24]  using  the  sample  covariance  matrix 
estimate  for  R  in  SIRV  scenarios  leads  to  incorrect  declaration 
of  data  nonhomogeneity.  This  fact  is  illustrated  in  the  examples 
presented  in  Section  IV.  Therefore,  we  seek  a  scale-invariant 
test  statistic  for  this  problem. 


III.  Nonhomogeneity  Detector  for  Non-Gaussian 
Interference  Scenarios 

Let  x  ~  SIRVfO,  R,  /y  (v)]  denote  the  complex  SIRV 
test  data  vector,  where  R  is  unknown.  Further,  let  x,, 
i  =  1, 2, . . .  K,  denote  iid  complex  SIRV[0,  R,  fv(v )]  training 
data.  The  first  step  in  deriving  the  NHD  for  SIRV’s  involves 
obtaining  the  maximum  likelihood  estimate  of  the  underlying 
covariance  matrix.  This  estimate  is  then  used  in  a  test  statistic 
that  exhibits  maximal  invariance  with  respect  to  the  unknown 
scaling  of  the  estimated  covariance  matrix.  The  resulting  test 
statistic  takes  the  form  of  a  normalized  adaptive  matched  filter 
(NAMF),  which  has  been  extensively  analyzed  in  [25]— [27]  and 
references  therein.  As  noted  previously,  the  basis  of  our  strategy 
to  detect  nonhomogeneity  in  the  data  is  to  first  characterize 
the  NHD  PDF  in  homogeneous  SIRP  clutter  scenarios  and  use 
this  information  to  construct  a  formal  goodness-of-fit  test  for 
rejecting  data  realizations  that  depart  from  the  said  PDF. 

A.  Covariance  Matrix  Estimation 

The  unknown  covariance  matrix  is  estimated  from  represen¬ 
tative  SIRV  training  data  sharing  the  covariance  structure  of 
that  of  the  test  cell.  Maximum  likelihood  (ML)  estimation  of 
the  covariance  matrix  for  SIRVs  was  first  considered  in  [28]. 
The  work  of  [28]  showed  that  covariance  matrix  estimation  for 
SIRVs  can  be  treated  in  the  framework  of  a  complete-incom¬ 
plete  data  problem  and  pointed  out  that  the  maximum  likeli¬ 
hood  estimate  of  the  covariance  matrix  is  a  weighted  sample 
matrix.  Since  the  covariance  matrix  estimate  cannot  be  obtained 
in  closed  form,  [28],  [29]  use  an  iterative  method  known  as  the 
expectation-maximization  (EM)  algorithm.  More  precisely,  let 
Xj,  i  =  1, 2, . . . ,  K  denote  iid  training  data  sharing  the  covari¬ 
ance  matrix  of  the  test  data  vector  x.  The  work  of  [28]  and  [29] 
shows  that  the  ML  estimate  of  the  covariance  matrix  is  given  by 


1 

R  =  ~  ^  c^xf  (3) 

1=1 
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where 

c.  —  _ 

h'2M(w)  =  dk2^W)  =  -  WsM  (4) 

and  qi  —  xfR-1Xj,  i  —  1, 2, ...  K.  Since  both  sides  of  (3) 
involve  R  (the  right-hand  side  implicitly  through  a),  it  is  not 
possible  to  obtain  the  estimate  in  closed  form.  Consequently, 
[28]  used  the  EM  algorithm  to  obtain  an  iterative  solution  to  the 
problem.  We  adopt  the  approach  of  [28]  for  obtaining  the  covari¬ 
ance  matrix  estimate  in  this  work.  A  derivation  of  the  covariance 
matrix  estimate  is  contained  in  the  Appendix.  We  note  therein 
that  the  EM  algorithm  yields  an  estimate  that  is  to  within  a  mul¬ 
tiplicative  constant  of  the  sample  covariance  matrix,  which  is  the 
ML  estimate  of  the  covariance  matrix  underlying  the  Gaussian 
component  of  the  SIRV.  This  fact  was  verified  for  all  the  simu¬ 
lated  data  examples  presented  in  Section  IV  by  examining  the 
eigenvalues  of  the  estimated  covariance  matrix  obtained  at  the 
convergence  of  the  EM  algorithm.  Details  pertaining  to  the  ini¬ 
tial  start  and  convergence  properties  of  the  EM  algorithm  can 
be  found  in  [28],  The  next  step  is  to  use  this  estimate  in  a  maxi¬ 
mally  invariant  decision  statistic  for  nonhomogeneity  detection. 

Recognizing  the  need  to  know  the  characteristic  SIRV  PDF, 
which  may  be  hard  to  obtain  in  some  practical  applications, 
the  works  of  [30]  and  [31]  propose  recursive  covariance  ma¬ 
trix  estimators  for  the  class  of  non-Gaussian  processes  where 
the  random  variable  V  of  the  SIRP  model  is  treated  as  a  de¬ 
terministic  but  unknown  parameter.  Strictly  speaking,  the  non- 
Gaussian  model  used  in  [30]  and  [31]  departs  from  the  SIRP 
model  due  to  the  treatment  of  V  as  a  deterministic  but  unknown 
scale  factor.  However,  it  serves  as  a  useful  alternative  model  in 
some  instances. 


B.  Maximally  Invariant  NHD  Test  Statistic 

The  maximal  invariant  statistic  for  different  scaling  of  test 
and  training  data  is  given  by  [25] 


Anamf  = 


|sHR_1xl2 


[sf/R_1s][xHR_1x] 


(5) 


where  s  =  (l/\/M)[l  1  ...  1]T.  For  convenience,  we  use  a 
simple  choice  for  s  by  designating  it  to  be  the  first  column  of  a 
normalized  discrete  Fourier  transform  (DFT)  matrix.  However, 
in  most  STAP  applications,  the  spatio-temporal  steering  vector 
is  a  function  of  azimuthal  angle  and  Doppler.  Bearing  in  mind 
that  we  are  concerned  about  training  data  containing  contami¬ 
nating  targets,  which  share  the  same  angle-Doppler  information 
as  that  of  a  desired  target,  the  spatio-temporal  steering  vector  is 
a  logical  choice  for  s. 

The  test  statistic  of  (5)  has  also  been  proposed  as  a  subop- 
timal  method  for  adaptive  radar  target  detection  in  compound- 
Gaussian  clutter  [32],  Invariance  properties  of  the  test  statistic 
of  (5)  and  its  geometrical  representation  have  been  studied  in 
[25]  and  references  therein  for  the  case  of  Gaussian  interference 
statistics  using  a  sample  covariance  matrix  estimate.  In  SIRP 
interference,  however,  each  training  data  vector  is  scaled  by  a 
different  realization  of  V.  Since  V  varies  independently  from 


one  training  data  vector  to  another,  maximal  invariance  of  the 
test  statistic  of  (5)  afforded  by  the  sample  covariance  matrix  es¬ 
timate  no  longer  applies.  This  is  due  to  the  fact  that  the  sample 
covariance  matrix  is  no  longer  the  maximum  likelihood  estimate 
of  the  covariance  matrix  for  SIRV  scenarios  [33], 

However,  using  an  estimated  covariance  matrix  of  the  form  of 
(3)  restores  the  maximal  invariance  property  of  the  test  statistic 
of  (5).  This  is  due  to  the  fact  that  the  resultant  covariance  matrix 
estimate  at  convergence  of  the  EM  algorithm  is  to  within  a  mul¬ 
tiplicative  constant  of  the  sample  covariance  matrix.  This  fact 
is  formally  demonstrated  in  Section  IV  through  simulation  by 
comparing  the  empirical  data  cumulative  distribution  function 
(CDF)  of  a  simple  transformation  on  Anamf  with  its  theoret¬ 
ical  CDF  for  several  cases  and  supplementing  the  results  with 
a  Kolmogorov-Smirnov  test  [34].  This  behavior  has  also  been 
verified  for  all  the  simulated  data  examples  presented  in  Sec¬ 
tion  IV  by  examining  the  eigenvalues  of  the  estimated  covari¬ 
ance  matrix  and  the  eigenvalues  of  the  sample  covariance  matrix 
formed  from  the  averaged  outer  products  of  the  Gaussian  com¬ 
ponent  underlying  the  SIRV  data.  Consequently,  we  now  have 
a  case  where  the  covariance  matrix  of  the  test  and  training  data 
share  the  same  structure  but  have  different  unknown  scaling.  It 
has  been  established  in  [25]  that  Anamf  is  the  invariant  test 
statistic  for  this  problem.  Hence,  the  canonical  representation 
for  Anamf  in  terms  of  five  random  variables  derived  in  [25] 
applies  to  this  problem  in  a  straightforward  manner.  However, 
we  emphasize  that  it  is  important  to  properly  estimate  the  SIRV 
covariance  matrix  in  order  to  reduce  the  NHD  problem  to  the 
case  where  test  and  training  data  covariance  matrices  differ  by 
an  unknown  scale  factor.  This  calls  for  knowledge  of  the  first 
order  SIRV  characteristic  PDF. 


C.  PDF  and  Moments  of  the  Non-Gaussian  NHD  Test  Statistic 

Our  comments  in  the  concluding  paragraph  of  Section  III-B 
allow  us  to  use  the  canonical  representation  for  Anamf 
contained  in  [25]  for  Gaussian  interference  scenarios.  Conse¬ 
quently,  the  PDF  of  the  NHD  test  statistic  is  readily  determined 
in  terms  of  a  random  variable  Aeq  obtained  from  a  transforma¬ 
tion  on  Anamf  defined  by 


.  _  Anamf 

eq  1  —  Anamf 


(6) 


Since  Aeq  is  a  monotonic  function  of  Anamf  >  one  can  work 
with  either  statistic.  For  convenience  of  analysis,  we  work  with 
the  PDF  of  Acq  in  the  sequel.  It  has  been  shown  in  [25],  [27],  and 
[35]  for  Gaussian  interference  statistics  that  Apq  admits  a  rep¬ 
resentation  in  terms  of  an  F-distributed  random  variable  P  and 
a  beta-distributed  loss  factor  T.  However,  use  of  the  covariance 
matrix  estimate  of  the  form  of  (3)  transforms  the  NHD  problem 
in  SIRV  interference  to  that  of  testing  whether  two  data  sets 
share  the  same  covariance  structure  with  differing  scale.  Con¬ 
sequently,  the  results  of  [25],  [27],  and  [35]  readily  extend  to 
the  SIRV  problem.  More  precisely,  for  the  case  where  no  target 
is  present  in  x  (homogeneous  data),  Aeq  admits  a  representation 
of  the  form 


Aeq  — 


P 

i  -r' 


(7) 
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The  PDFs  of  P  and  T  are  given  by 
L 


fp{P )  = 
fr{l)  = 


(1  +p)L+1 
1 


p(L+l,M-l) 
where  L  —  K  —  M  +  1,  and 


p  >  0 

7L(l-7  )W-2 


0  <  7  <  1  (8) 


P(m,,n)  =  f  xm  X(1  —  rc)n  1dx. 
Jo 


(9) 


After  a  little  bit  of  algebra,  it  follows  that  that  the  PDF  of  Acq 
with  no  target  present  in  x  (homogeneous  data)  is  given  by 


L{  1  ~  7)/r(7)^7 

[1  +  (1  -  7)r]L+1 


(10) 


while  the  pdf  of  Anamf  with  no  target  present  in  x  (homoge¬ 
neous  data)  is  given  by 


/ Anamf 


L(  1  -  7)/r(7)^7  1 

[l  +  (l-7)^:]i+1  (1-r)2' 


(II) 


The  mean  of  Anamf  is  difficult  to  calculate  analytically.  Con¬ 
sequently,  we  work  with  the  mean  of  Acq  given  by 


E(  Aeq) 


K 

(K  —  M)(M  —  2) 


M  >2 


(12) 


to  study  the  convergence  properties  in  the  limit  of  large  K.  The 
statistical  equivalence  of  Acq  to  the  ratio  of  an  F-distributed 
random  variable  and  a  beta-distributed  loss  factor  permits  rapid 
calculation  of  the  moments  of  Acq.  It  is  also  extremely  useful  in 
Monte  Carlo  studies  involving  simulation  of  Anamf-  For  ho¬ 
mogeneous  training  data,  the  use  of  (7)  circumvents  the  need 
to  explicitly  generate  the  test  data  vector  x  and  the  training 
data  vectors  used  for  covariance  estimation.  For  large  M  and, 
hence,  large  K,  significant  computational  savings  can  be  real¬ 
ized  from  the  method  of  (7).  It  is  instructive  to  note  that  the  PDF 
of  Acq  depends  only  on  M  and  K,  which  are  under  the  con¬ 
trol  of  a  system  designer  and  not  on  nuisance  parameters  such 
as  the  true  covariance  matrix  underlying  the  interference  sce¬ 
nario.  Furthermore,  for  K  — >  oo,  the  mean  of  (12)  converges  to 
.E(Acq)  =  1  /(M  —  2)  M  >  2,  corresponding  to  the  mean  of 
an  F-distributed  random  variable.  This  is  due  to  the  fact  that  as 
K  — >  oo,  the  estimated  covariance  matrix  approaches  the  true 
covariance  matrix  with  probability  one,  and  thus,  the  loss  factor 
takes  on  the  value  zero  with  probability  one. 


D.  Goodness-of-Fit  Test 

Since  the  PDF  of  Acq  and  Anamf  are  known,  a  formal 
goodness-of-fit  test  can  be  used  for  nonhomogeneity  detec¬ 
tion  in  non-Gaussian  interference  scenarios.  Specifically,  the 
goodness-of-fit  test  can  be  cast  in  the  form  of  the  following 
statistical  hypothesis  test: 

Ho  :Anamf  is  statistically  consistent  with 
the  pdf  of  (11) 

H i  :Anamf  is  not  statistically  consistent  with 
the  pdf  of  (11). 


Fig.  1.  Aeq  PDF  of  (10)  with  varying  K\  M  =  64. 


For  this  purpose,  we  need  to  determine  the  type-I  error  [34] 
given  by 


Pe  =  P( Anamf  >  v\Ho)  =  -P[Aeq  >  ^  .  |ffo]-  (13) 

l1  -v) 


The  type-I  error  is  the  probability  of  observing  under  H0  a 
sample  outcome  at  least  as  extreme  as  the  one  observed  [34]  and 
hence  provides  the  smallest  level  at  which  the  observed  sample 
statistic  is  significant.  Using  (6)  and  (8),  it  follows  that  the  prob¬ 
ability  of  error  conditioned  on  T  is  given  by 


Pe|r  = 


1 

[1  +  (1  -  7)77*]^ 


(14) 


where  rf  —  77/(1  —  77).  The  unconditional  type-I  error  proba¬ 
bility  is  obtained  by  taking  the  expectation  of  (14)  over  T  and  is 
given  by 


Pe  l  {i  +  (i-l)n^dr  (15) 

In  this  paper  the  type-I  error  is  chosen  to  be  0.01 .  The  threshold 
77*  is  determined  by  a  numerical  inversion  of  (15).  The  value 
of  77  follows  from  the  relationship  77  =  77* /(I  +  if).  We  then 
form  empirical  realizations  of  Anamf  from  each  training  data 
vector  using  a  sliding  window  approach.  In  this  approach,  each 
training  data  vector  is  treated  as  a  test  cell  data  vector,  whose 
covariance  matrix  is  estimated  from  neighboring  cell  data  ac¬ 
cording  to  (3).  We  then  test  for  statistical  consistency  of  the  re¬ 
alizations  of  Anamf  with  the  theoretical  PDF  of  (1 1).  Realiza¬ 
tions  of  Anamf-  which  exceed  77,  correspond  to  nonhomoge- 
neous  training  data.  A  desirable  feature  of  Pe  is  that  it  depends 
only  on  K  and  M  and  not  on  nuisance  parameters  such  as  the 
true  covariance  matrix  underlying  the  interference.  Performance 
analysis  of  the  NHD  method  is  presented  in  the  next  section. 

IV.  Performance  Analysis 

Performance  of  the  goodness-of-fit  test  with  simulated  and 
measured  data  is  presented  here.  Fig.  1  shows  the  plot  of  the 
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Fig.  2.  Type-I  Error  versus  r)'  calculated  using  (15):  M  =  64. 


PDF  of  Aeq  obtained  from  (10)  with  K  as  a  parameter.  Observe 
that  the  variance  of  Aeq  decreases  with  increasing  K.  This  is  due 
to  the  fact  that  better  covariance  matrix  estimates  result  with  in¬ 
creasing  K  and  when  K  — >  oo,  the  estimated  covariance  matrix 
approaches  R  with  probability  1. 

Fig.  2  shows  a  plot  of  the  Type-I  error  versus  77*,  with  K  as  a 
parameter.  For  a  given  type-I  error,  the  threshold  decreases  with 
increasing  K,  in  conformance  with  the  results  of  Fig.  1. 

For  convenience  of  analysis,  simulated  data  examples  con¬ 
tained  herein  use  the  K-distributed  amplitude  PDF  given  by 
[12],  [13],  [21] 

fn(r)  =  ^-irCa)Ka-libr)  r  ^  °-  6’  a  >  0  (16) 

where  b  and  a  are  the  distribution  scale  and  shape  parameters, 
respectively,  Kv{.)  is  the  modified  Bessel  function  of  the  second 
kind  of  order  v,  and  r(.)  is  the  Eulero-Gamma  function.  The 
K-distribution,  which  is  a  member  of  the  class  of  SIRPs  [12], 
has  been  proposed  as  a  model  for  impulsive  clutter  resulting 
from  terrain  and  sea  scatter  [36],  [37],  Small  values  of  a  result 
in  heavy  tails  for  the  PDF  of  (16).  The  corresponding  fv(v)  and 
/i2A*(.)  are  given  by 

fv(v)  =  exp(-fcV)  0  <  v  <  00 

o  h2M 

hauH  =  Tr^(b^)a-MKa-M(2b^).  (17) 

r(a) 

First,  Figs.  3-5  demonstrate  the  invariance  of  the  PDF  of  (10) 
for  the  K-distribution  (a  =  0.1,  0.5)  and  the  Gaussian  case 
(a  — *  00).  More  precisely,  these  figures  show  plots  of  the  the¬ 
oretical  CDF  obtained  from  (10)  and  the  empirical  CDF  of  Acq 
formed  from  homogeneous  simulated  data  using  (6).  Relevant 
test  parameters  are  reported  in  the  plots.  First,  the  covariance 
matrix  is  estimated  from  K  =  2 M  homogeneous  training  data 
realizations  using  the  EM  algorithm  discussed  in  Section  III-A. 


Empirical  and  Theoretical  CDF 


Fig.  3.  Theoretical  and  empirical  CDFs  of  Aeq:  Gaussian  interference. 


Theoretical  and  Empirical  CDF 


Fig.  4.  Theoretical  and  empirical  CDF’s  of  Aeq:  K-Distributed  interference 
o  =  0.5. 

Then,  Aeq  is  formed  using  1000  independent  realizations  of  ho¬ 
mogeneous  data  and  its  empirical  CDF  is  compared  to  its  theo¬ 
retical  CDF  given  by 

+  <l8> 

In  all  cases,  the  empirical  CDF  shows  good  agreement  with  the 
theoretical  CDF.  This  was  further  supplemented  by  performing 
a  Kolmogorov-Smimov  test  [34]  to  determine  the  statistical 
consistency  of  realizations  of  Aeq  from  the  simulated  data  with 
the  theoretical  PDF  of  (10).  In  all  cases,  we  could  not  reject  the 
hypothesis  that  the  realizations  of  Acq  formed  from  the  simu¬ 
lated  data  using  conform  to  the  PDF  of  ( 1 0)  at  a  1  %  significance 
level.  These  findings  confirm  the  scale  invariance  of  Aeq  (and, 
hence,  Anamf)  and  verify  the  correctness  of  the  PDF  given  by 
(10). 


Fig.  5.  Theoretical  and  empirical  CDF’s  of  Aeq:  K-Distributed  interference 
cy  =  0.1. 


Fig.  6.  An  amf  versus  range  bin  number  for  homogeneous  K-distributed 
SIRV  with  rr  =  0.5,  b  =  \fc\,  M  =  G4,  and  K  =  128. 

We  then  generate  1024  realizations  of  a  64  tuple  vector  from 
the  K-distributed  SIRP  with  a  =  0.5  having  a  prescribed  co- 
variance  matrix  according  to  the  physical  model  described  in 
[38]  using  the  approach  of  [13].  No  targets  are  added  to  this  data 
set.  Starting  from  the  midpoint  (range  bin  512),  the  data  set  is 
processed  symmetrically  on  either  side  using  a  sliding  window. 
Each  cell  is  treated  as  a  test  cell  (which  may  or  may  not  contain 
contaminating  targets).  Two  Guard  cells  are  provided  (one  on 
each  side  of  the  test  cell).  One  hundred  and  twenty  eight  training 
data  realizations  are  collected  by  moving  symmetrically  on  ei¬ 
ther  side  of  the  guard  cells  for  use  in  covariance  matrix  esti¬ 
mation.  The  covariance  matrix  estimate  is  obtained  using  (3). 
Anamf  given  by  (5)  is  then  calculated  for  each  test  cell  using 
the  estimated  covariance  matrix  and  compared  to  a  threshold  de¬ 
termined  from  (15)  for  Pe  =  0.01.  Relevant  test  parameters  are 
reported  in  the  plots. 

Fig.  6  shows  the  performance  of  the  goodness-of-fit  test  for 
simulated  homogeneous  data  from  the  K-distribution  [21]  with 
shape  parameter  0.5  using  the  covariance  estimate  of  (3).  The 


Fig.  7.  Normalized  GIP  (x"S  'x/iQ  versus  range  bin  number  for 
homogeneous  K-distributed  SIRV  with  a  —  0.5,  b  =  -Ja.,  M  =  G4,  and 
K  =  128. 

figure  shows  a  plot  of  Anamf  as  a  function  of  range.  No  re¬ 
alization  of  Anamf  exceeds  rj,  reflecting  homogeneity  of  the 
data.  The  experiment  was  repeated  1 000  times,  and  in  all  cases, 
Anamf  did  not  exceed  rj,  confirming  the  homogeneity  of  the 
data. 

Fig.  7  shows  the  performance  of  the  NHD  test  proposed  in 
[10],  [11],  and  [24]  based  on  comparing  the  normalized  GIP 
xHS~1x/K,  with  the  threshold-setting  determined  according 
to  [24,  eq.  (4.2)].  Here,  S  =  Xjxf  is  simply  the 

sample  covariance  matrix.  The  data  set  used  here  is  the  same 
as  the  data  set  used  for  the  example  in  Fig.  6.  The  normalized 
GIP  is  formed  using  sliding  window  processing,  as  described 
in  [10],  [11],  and  [24].  Fig.  7  shows  a  plot  of  the  normalized 
GIP  as  a  function  of  range.  The  threshold  setting  is  also  plotted. 
From  the  plot,  it  is  evident  that  for  almost  all  range  bins  the 
normalized  GIP  exceeds  the  threshold,  leading  to  the  declaration 
of  nonhomogeneity,  when  in  fact,  the  data  is  homogeneous. 

Fig.  8  shows  the  performance  of  a  second  goodness-of-fit  test 
proposed  in  [10],  [11],  and  [24],  which  compares  the  normal¬ 
ized  GIP  xHS  ~1x/K  to  a  theoretically  calculated  mean  value 
obtained  from  [24,  eq.  (3.6)].  The  data  used  for  this  example  is 
the  same  as  that  used  in  the  example  of  Fig.  6.  Fig.  8  shows  a 
plot  of  the  normalized  GIP  as  a  function  of  range.  The  theoret¬ 
ically  calculated  mean  value  is  also  shown.  Again,  we  see  that 
for  almost  all  range  bins,  the  normalized  GIP  exceeds  the  mean 
value,  causing  an  incorrect  declaration  of  data  nonhomogeneity. 

Fig.  9  shows  the  performance  of  the  NHD  test  proposed  in 
[3],  [4],  and  [9],  which  compares  the  GIP  xHS~1x  to  a  theo¬ 
retically  specified  mean  value  of  M.  The  data  used  for  this  ex¬ 
ample  is  the  same  as  that  used  in  the  example  of  Fig.  6.  Fig.  9 
shows  a  plot  of  the  GIP  as  a  function  of  range.  The  theoreti¬ 
cally  specified  mean  value  is  also  shown.  Again,  we  see  that  for 
almost  all  range  bins,  the  GIP  exceeds  the  mean  value  causing 
an  incorrect  declaration  of  data  nonhomogeneity.  The  incorrect 
declaration  of  nonhomogeneity  is  due  to  the  fact  that  S  is  no 
longer  the  ML  estimate  of  the  covariance  matrix  for  SIRV  inter¬ 
ference  scenarios.  Similar  results  showing  an  even  more  severe 
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Fig.  8.  Normalized  GIP  (xHS~'x/K)  versus  range  bin  number  for 
homogeneous  K-distributed  SIRV  with  rv  =  0.5,  b  =  y/a,  M  =  64,  and 
K  =  128. 


Fig.  9.  GIP  (x^S^x)  versus  range  bin  number  for  homogeneous 
K-distributed  SIRV  with  or  =  0.5, 6  =  y/o,  M  =  64,  and  K  =  128. 


performance  degradation  in  K-distributed  clutter  with  a  =  0.1 
were  obtained.  However,  these  results  are  not  reported  here  in 
the  interest  of  avoiding  tediousness  of  exposition.  The  experi¬ 
ments  pertaining  to  Figs.  7-9  were  repeated  1000  times,  and  all 
the  trials  exhibited  performance  consistent  with  that  are  reported 
in  Figs.  7-9. 

Fig.  10  shows  the  performance  of  the  goodness-of-fit  test  de¬ 
veloped  in  this  paper  in  K-distributed  clutter  with  shape  param¬ 
eter  0.5.  Synthetic  targets  were  injected  at  range  bins  479  and 
510  to  cause  the  nonhomogeneity.  Nonhomogeneity  of  the  data 
is  evident  in  those  range  bins  where  Anamf  exceeds  rj. 

Fig.  1 1  shows  the  performance  of  the  goodness-of-fit  test  in 
K-distributed  clutter  with  a  =  0.1.  Synthetic  targets  were  in¬ 
jected  at  range  bins  510  and  552  to  cause  the  nonhomogeneity. 
Clearly,  Anamf  exceeds  rj  for  both  of  these  range  bins,  and  thus, 
they  are  declared  to  be  nonhomogeneous  data  sets. 

Fig.  12  shows  the  results  of  the  goodness-of-fit  test  using  the 
covariance  matrix  estimator  proposed  in  [30]  and  [31].  This  es¬ 
timator  does  not  require  knowledge  of  the  first-order  charac¬ 
teristic  PDF  of  the  SIRV  and,  therefore,  converges  faster  than 


Range 

Fig.  10.  Anamf  versus  range  bin  number  for  nonhomogeneous  K-distributed 
SIRV  with  rv  =  0.5,  b  =  y/S,  M  =  64,  and  K  =  128. 


Fig.  1 1 .  Anamf  versus  range  bin  number  for  nonhomogeneous  K-distributed 
SIRV  with  or  =  0.1, 6  =  y/o,  M  =  64,  and  I<  =  128. 


Fig.  12.  Anamf  versus  range  bin  number  for  nonhomogeneous  K-distributed 
SIRV  with  a  —  0.5,  b  =  y/a,  M  =  64,  K  =  128,  and  the  covariance  matrix 
estimate  of  [30],  [31], 


2108 


IEEE  TRANSACTIONS  ON  SIGNAL  PROCESSING,  VOL.  53,  NO.  6,  JUNE  2005 


TABLE  I 

NHD  Performance  Summary 


Fig.  Number 

Range  Bin 

Realizations 

Exceedences 

10 

510 

1000 

997 

10 

479 

1000 

997 

11 

510 

1000 

984 

11 

552 

1000 

984 

12 

573 

1000 

971 

Fig.  13.  Anamf  versus  range  bin  number  using  MCARM  data:  M  =  128, 
I<  =  256. 


the  estimator  of  (3),  especially  for  small  values  of  a.  The  data 
set  used  for  this  example  is  the  same  as  that  used  for  the  ex¬ 
ample  in  Fig.  10.  Although  a  peak  in  the  test  statistic  is  seen 
at  range  bin  479,  it  does  not  exceed  the  threshold,  whereas  the 
peak  is  not  seen  for  range  bin  510.  Therefore,  contaminating 
targets  in  range  bins  479  and  510  are  not  detected.  Further¬ 
more,  the  method  erroneously  reports  the  presence  of  a  contam¬ 
inating  target  at  range  bin  573.  This  illustrates  the  importance  of 
knowing  the  underlying  characteristic  PDF  to  properly  estimate 
the  covariance  matrix  and  use  it  in  the  NHD  statistic. 

The  results  contained  in  Figs.  10-12  were  further  validated  by 
using  1000  realizations  of  the  experiment  and  averaging  the  re¬ 
sults  over  50  independent  trials.  In  997  out  of  the  1000  trials,  the 
NHD  realizations  corresponding  to  bins  479  and  510  of  Fig.  10 
exceeded  the  threshold.  For  Fig.  11,  in  984  times  out  of  1000 
trials,  the  realizations  corresponding  to  range  bins  5 10  and  552 
exceeded  the  threshold.  The  corresponding  number  for  Fig.  12 
was  971.  These  findings  are  summarized  in  Table  I. 

The  examples  reported  in  Figs.  13-15  make  use  of  the 
MCARM  data  [3].  The  MCARM  data  consists  of  measured 
L-band  radar  data  using  a  Westinghouse  radar  mounted  on 
the  port-side  of  a  B AC  1-11  aircraft.  The  relevant  system  pa¬ 
rameters  are  summarized  in  Table  II.  The  MCARM  data  is  a 
common  test  bed  for  performance  analysis  and  bench-marking 
of  STAP  algorithms  and  is  therefore  considered  in  this  paper. 
Further  details  pertaining  to  the  MCARM  data  can  be  found  in 
[39].  Fig.  13  shows  the  results  of  the  goodness-of-fit  test  for 
the  MCARM  data  using  acquisition  220  on  Flight  5,  cycle  e  for 
eight  channels  and  16  pulses.  The  results  of  [3],  [40],  and  [41] 


Range 


Fig.  14.  Normalized  G1P  (x"S  lx/K)  versus  range  bin  number  using 
MCARM  data:  M  =  128,  I<  =  25C. 


Range 


Fig.  15.  GIP(x"S  ’x)  versus  range  bin  number  using  MCARM  data:  M  = 
128,  K  =  256. 


TABLE  II 

MCARM  Data  Parameters 


Parameter 

Value 

Transmit  Frequency 

1240  MHz 

Transmit,  Beamwidth 

6.7°  Az.,  10.4°  El 

Waveform 

50.4  ps  LFM 

Peak  Transmit.  Power 

20  kw 

Pulse  Compression  Ratio 

63 

Platform  Altitude 

10.000  ft 

Platform  Velocity 

100  meters/sec 

Array  Configuration 

11x11  Planar  Array 

Number  of  Pulses 

128 

Pulse  Repetition  Frequency 

2  kHz 

Number  of  Unambiguous  Range  Bins 

630 

using  the  MCARM  tend  to  confirm  that  the  MCARM  data  is 
homogeneous  for  the  most  part.  Statistical  analysis  of  the  data 
indicates  that  the  data  is  well-approximated  by  the  Gaussian  dis¬ 
tribution  [3],  As  a  consequence,  c,;  =  —h'2M  (q, ) jhj.M (q, )  =  1 
for  this  case.  Hence,  the  maximum  likelihood  estimate  of  the 
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covariance  matrix  is  simply  the  sample  covariance  matrix.  The 
test  statistic  Anamf  and  the  threshold  77  are  plotted  as  a  func¬ 
tion  of  range.  Nonhomogeneity  of  the  data  is  evident  in  those 
bins  for  which  Anamf  exceeds  77.  For  the  sake  of  comparison. 
Figs.  14  and  15  show  the  performance  of  the  NHD  methods  of 

[10] ,  [11],  and  [24]  and  [3],  [4],  and  [9],  respectively,  using  the 
same  MCARM  data  set  used  in  the  example  of  Fig.  13.  Fig.  14 
shows  the  results  of  the  NHD  test  proposed  in  [10],  [11],  and 
[24]  based  on  comparing  the  normalized  GIP  xHS ~lx/K  with 
the  threshold  setting  determined  according  to  [24,  eq.  (4.2)]. 
The  MCARM  data  set  is  processed  in  the  same  manner  as  de¬ 
scribed  in  the  example  of  Fig.  7.  Fig.  15  shows  the  performance 
of  the  NHD  test  proposed  in  [3],  [4],  and  [9],  which  compares 
the  GIP  xHS_1x  to  a  theoretically  specified  mean  value  of  M 
using  the  MCARM  data.  The  MCARM  data  set  is  processed 
in  the  same  manner  as  described  in  the  example  of  Fig.  9. 
It  is  seen  from  Figs.  14  and  15  that  a  lot  more  declarations 
of  nonhomogeneity  result  from  the  NHD  methods  of  [10], 

[1 1] ,  and  [24]  and  [3],  [4],  and  [9],  when  in  fact  the  MCARM 
data  is  homogeneous.  Thus,  the  NHD  method  of  this  paper 
outperforms  competing  techniques  for  Gaussian  interference 
scenarios  as  well,  in  terms  of  the  type-I  error  performance. 


V.  Conclusion 

This  paper  provides  a  statistical  characterization  of  the 
NHD  for  non-Gaussian  interference  scenarios,  which  can  be 
modeled  as  a  spherically  invariant  random  process.  A  formal 
goodness-of-fit  test  based  is  derived.  Performance  analysis  of 
the  method  is  considered  in  some  detail  using  simulated  as  well 
as  measured  data  from  the  MCARM  program.  The  performance 
comparison  of  the  method  with  other  NHD  techniques  is  also 
undertaken.  The  illustrative  examples  validate  the  approach 
taken  and  confirm  the  improved  performance  of  the  technique 
of  this  paper  in  both  Gaussian  and  non-Gaussian  interference 
scenarios  with  respect  to  a  type-I  error  criterion.  Future  work 
would  include  extensive  performance  analysis  using  simulated 
and  measured  data  showing  the  resulting  impact  on  STAP 
performance.  The  performance  of  several  STAP  algorithms  in 
Gaussian  and  non-Gaussian  interference  scenarios  has  been 
considered  in  [26].  Future  work  will  address  performance  of  the 
methods  treated  in  [26]  in  conjunction  with  NHD  processing 
described  herein  to  combat  heterogeneous  interference  sce¬ 
narios.  Preliminary  work  (not  reported  here)  in  this  direction 
reveals  that  the  estimator  of  (3)  is  rather  slow  to  converge,  even 
for  moderate  system  dimension.  A  related  research  direction 
is  the  performance  comparison  of  model-based  parametric 
STAP  methods  (which  do  not  require  NHD  preprocessing) 
with  sample  covariance-based  STAP  methods  employing  NHD 
preprocessing  in  dense  target  environments.  Analysis  in  this 
direction  is  undertaken  in  [42]. 


Appendix 

EM  Algorithm  for  Covariance  Matrix  Estimation 


trix,  whose  columns  x;  i  =  1, 2, . . . ,  K  are  iid  training  data 
vectors,  which  are  distributed  as  SIRV[0,  Kx,fv{v)]-  The  like¬ 
lihood  function  for  estimating  R  is  given  by 

K 

7j[X|R]  =  n7r-M|Rr1/l2Mte).  (19) 

1=1 

Direct  maximization  of  the  likelihood  function  of  (19)  over  R 
is  rendered  difficult  due  to  the  fact  that  there  is  missing  in¬ 
formation.  Consequently,  it  is  helpful  to  treat  the  problem  in 
the  context  of  a  complete-incomplete  data  problem  [28],  Re¬ 
call  from  the  representation  theorem  for  SIRVs  [22]  that  xt  = 
ZiVi,  where  z ;,  i  =  1, 2, . . . ,  K  are  statistically  independent 
CN( 0,  R)  random  vectors,  and  Vi  i  =  1, 2, . . . ,  K  are  statis¬ 
tically  independent  random  variables  with  PDF  fv(v).  For  this 
problem,  the  complete  data  is  either  z i,V{,i  =  1, 2, . . . ,  K, 
or  Xi,  Vi,  i  =  1,2 ,K.  However,  the  observed  data  xi; 
i  =  1, 2, . . . ,  K  contains  no  explicit  information  about  V;,  i  = 
1,2 ,K  and,  thus,  constitutes  the  incomplete  data.  The  com¬ 
plete  data  likelihood  function  is  given  by  the  joint  PDF  of  x.;, 
Vi,  i  =  1, 2, . . . ,  K,  which  is  expressed  as 

K  K 

9c[X,  V;|R]  =  n  f(xi\V)  n  (20) 

i- 1  »=  1 

Taking  the  natural  logarithm  of  (20)  yields  the  complete-data 
log-likelihood  function  of  the  form 

K 

£[X,  V<|R]  =  -KM  log(7r)  -  Xlog(|R|)  -  ^T^f2 

i=i 

K 

+  £log  [v~2Mf(Vi)\.  (21) 

i—i 

Note  that  given  an  initial  estimate  of  R  denoted  by  R,  the  quan¬ 
tity 

£{log[«r2M/(«i)]|R}  (22) 

depends  only  on  R  and  not  on  R.  Consequently,  the  relevant 
terms  for  the  maximization  over  R  are  given  by 

K 

At[X, Vj|R]  =  -inog(|R|)  -YwT2-  (23) 

i=l 

The  missing  data  Vi,  i  =  1, 2, ...  K  are  assumed  to  be  missing 
at  random  (MAR)  [28],  Consequently,  given  an  initial  estimate 
of  R  denoted  by  R,  the  complete  data  sufficient  statistic  [28]  is 
given  by 


Ci  =  £[V;  2|R,  Xj].  (24) 


Thus,  Ci  is  simply  the  minimum  mean  squared  error  (MMSE) 
estimate  of  V~ 2  given  x.;.  Note  that  f(v  1)  =  f(v 2)  =  •  •  •  = 
f(vi<)  =  fv(v)  (since  Vi,  i  =  1, 2, ...  K  are  iid  random  vari¬ 
ables).  Therefore 


fvipti,  k(w*Ix*> 


^  _  f(xi\v.j,  R)/y(fi) 
/x;|R,(Xl'l-^') 


We  discuss  the  maximum  likelihood  estimation  of  the  SIRV 
covariance  matrix  in  this  appendix.  Let  X  denote  a  data  ma- 


(25) 
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However 

/(x, \vh  R .)fv(vj)  _  v~2AI  exp (qiv72)fv(vi)  (26) 
fXi  |R.(xi|R-)  h2M(qi) 

Consequently 

c  _  /0°°^~2M~2  exp (givr2)fv{vi)(tvi 
h2M{qi) 

_  ^2 AfCgt) 
h.2M(qi) 

_  h2M+i{qi) 

h2M(qi)  ' 

Having  specified  the  complete  data  sufficient  statistic,  we  seek 
the  maximization  of  (23).  For  this  purpose,  we  reproduce  the 
following  matrix  differentiation  identities  from  [43]: 

<5[R_1]  =  -  RT^RJR-1 
«[log|R-1|]=  -  tr{R-xS[R]}.  (28) 

Further,  we  recognize  that  qi  —  ^^LjtrpR^Xjxf*].  Conse¬ 
quently 

tfLi[X,  V<|R]  =  JftrfR-^p.]} 

K 

-tr[R-15[R]R_1  ^  CjXiX^].  (29) 

i=l 

Maximization  of  (23)  results  from  setting  (29)  equal  to  zero. 
Therefore,  the  maximum  likelihood  estimate  of  R  is  given  by 

1  K 

R=-^5>x;xf.  (30) 

i= 1 

Since  R  appears  on  both  sides  of  (30)  (implicitly  on  the 
right-hand  side  through  the  calculation  of  c*),  it  is  not  possible 
to  obtain  a  closed  form  solution  for  the  resulting  estimate. 
Consequently,  an  iterative  method  is  needed  for  calculating  R. 
The  EM  algorithm,  which  provides  an  iterative  solution  to  this 
problem,  is  summarized  below. 

1)  E-Step:  Given  an  initial  estimate  of 

R  denoted  by  R,  calculate  cj  for  i  = 

1,2,...,K. 

2)  M-Step:  Calculate  R  =  (l/AT)]>T=1CiX,xfJ . 

3)  Use  R  from  Step  2  to  calculate  a  new 
value  of  Ci . 

4)  Iterate  until  convergence.  Convergence 
is  determined  through  a  suitable  error 
criterion . 

In  this  paper,  the  convergence  criterion  is  an  error  of  10-6 
defined  to  be  the  Frobenius  norm  of  the  difference  between  the 
values  of  R  resulting  from  two  successive  iterations.  At  con¬ 
vergence,  the  resultant  R  is  to  within  a  multiplicative  constant 
of  the  sample  covariance  matrix.  This  is  due  to  the  fact  that 


the  outer  product  of  each  training  data  realization  with  itself  is 
scaled  by  the  MMSE  estimate  of  V*2.  This  fact  has  been  veri¬ 
fied  for  all  the  simulated  data  examples  presented  in  the  paper. 
In  particular,  we  examined  the  diagonal  matrix  of  eigenvalues 
of  the  estimated  covariance  matrix.  We  found  that  they  were  to 
within  a  multiplicative  constant  of  the  eigenvalues  of  the  sample 
covariance  matrix  formed  by  averaging  the  outer  products  of  the 
realizations  z  pi  =  1, 2, . . . ,  K  of  the  Gaussian  component  of 
the  SIRV  x.;.  Convergence  of  the  algorithm  is  dictated  by  the 
choice  of  the  initial  estimate  of  R.  Any  positive  definite  Hermi- 
tian  matrix  is  suitable  for  the  initial  estimate  of  R.  Two  choices 
that  readily  arise  are  the  M  x  M  identity  matrix  Im  and  the 
sample  covariance  matrix  given  by  S  =  (l/Jf^jiiXjX^.  We 
employ  the  latter  choice  in  this  paper  due  to  the  fact  that  it  yields 
faster  convergence. 

The  simulated  data  examples  in  this  paper  employing  the  co- 
variance  matrix  estimate  of  (3)  involve  a  calculation  of  the  mod¬ 
ified  Bessel  function  of  the  second  kind  for  specifying  /i2m(-) 
and  its  derivative.  Numerical  errors  in  their  calculation  for  a  = 
0.1  tend  to  be  rather  large.  Consequently,  convergence  of  the  al¬ 
gorithm  is  extremely  slow  for  a  =  0.1. 
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